QR Decomposition in a Multicore Environment

Author

  • Omar Ahsan
Abstract

In this study we examine the performance benefits of implementing the QR decomposition so that it takes advantage of multiple processes or threads. The matrix is partitioned into blocks of rows; the number of rows per block is called the blocksize. We examine this algorithm on “tall and skinny” matrices, which have a very large number of rows but comparatively few columns. These matrices are important because one of their most common uses is linear regression, a tool used in many different fields. We also compare this implementation to one that uses a MapReduce environment to compute the QR decomposition. We find that partitioning the matrix and using multiple processes to compute the QR decomposition in parallel is much faster than computing the QR decomposition directly on the original matrix.
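The row-block partitioning described in the abstract can be illustrated with a minimal sketch. It assumes NumPy and a thread pool, and it computes only the R factor; the function names `tsqr_r` and `_block_r` and the parameters `block_size` and `max_workers` are hypothetical and are not taken from the paper. Each row block is factored independently, the small R factors are stacked, and one more QR on the stack combines them.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor


def _block_r(block):
    # R factor of one row block (Q is discarded in this sketch).
    return np.linalg.qr(block, mode="r")


def tsqr_r(A, block_size, max_workers=4):
    """Return the n-by-n R factor of a tall-and-skinny matrix A (m >= n).

    Hypothetical helper: block_size plays the role of the "blocksize"
    in the abstract, i.e. the number of rows per block.
    """
    m, n = A.shape
    if m < n:
        raise ValueError("expected a tall matrix with at least as many rows as columns")
    if block_size < n:
        raise ValueError("block_size must be at least the number of columns")
    # Split A into consecutive row blocks of at most block_size rows.
    blocks = [A[i:i + block_size] for i in range(0, m, block_size)]
    # Factor the blocks concurrently.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        r_factors = list(pool.map(_block_r, blocks))
    # Stack the small R factors and factor once more to combine them.
    return np.linalg.qr(np.vstack(r_factors), mode="r")


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((100_000, 8))
    R = tsqr_r(A, block_size=10_000)
    # R is determined only up to the signs of its rows, so compare R^T R.
    print(np.allclose(R.T @ R, A.T @ A))
```

Threads are used here on the assumption that the underlying LAPACK calls do most of the work outside the Python interpreter lock; a process pool could be swapped in to match the multi-process setting the abstract mentions, at the cost of copying each block to a worker.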

Similar resources

Multifrontal multithreaded rank-revealing sparse QR factorization

SuiteSparseQR is a sparse multifrontal QR factorization algorithm. Dense matrix methods within each frontal matrix enable the method to obtain high performance on multicore architectures. Parallelism across different frontal matrices is handled with Intel’s Threading Building Blocks library. Rank-detection is performed within each frontal matrix using Heath’s method, which does not require colu...

Fine Granularity Sparse QR Factorization for Multicore Based Systems

The advent of multicore processors represents a disruptive event in the history of computer science as conventional parallel programming paradigms are proving incapable of fully exploiting their potential for concurrent computations. The need for different or new programming models clearly arises from recent studies which identify fine-granularity and dynamic execution as the keys to achieve hi...

An Implementation of the Tile QR Factorization for a GPU and Multiple CPUs

The tile QR factorization provides an efficient and scalable way for factoring a dense matrix in parallel on multicore processors. This article presents a way of efficiently implementing the algorithm on a system with a powerful GPU and many multicore CPUs.

Two-Stage Least Squares Algorithms with QR Decomposition for Simultaneous Equations Models on Heterogeneous Multicore and Multi-GPU Systems

Carla Ramiro, José J. López-Espín, Domingo Giménez and Antonio M. Vidal

Fully Empirical Autotuned QR Factorization For Multicore Architectures

Tuning numerical libraries has become more difficult over time, as systems get more sophisticated. In particular, modern multicore machines make the behaviour of algorithms hard to forecast and model. In this paper, we tackle the issue of tuning a dense QR factorization on multicore architectures. We show that it is hard to rely on a model, which motivates us to design a fully empirical approac...

Publication date: 2014